Adaptive prediction schemes provide lower transmission rates than
those obtained by simple previous frame prediction. In this paper, we
measure the entropy of prediction errors for two types of adaptive
intra-interframe prediction algorithms. In the first case, the predictor
that results in the least prediction error for previously transmitted
neighboring pels is selected from a set of predictor functions.
In the second case, the prediction is a weighted sum of previous frame
and intraframe predictions, where the weights are changed from pel
to pel by gradient techniques. We also investigate various modifications
of the basic methods. Further, a new type of variable length
encoding, in which the locations of the nonzero prediction errors are
coded by horizontal run lengths, is discussed. Compared with the pel
entropy of previous frame prediction, the run length coding gives a
gain of 2 to 16 percent, depending on the scene. Compared to simple
previous frame prediction, the first type of adaptive scheme in
combination with horizontal run length coding provides a gain in entropy
of 18 to 29 percent, whereas the second type of adaptive scheme
provides a gain of 20 to 32 percent.

I. INTRODUCTION 

The bit rate required for digital transmission of television pictures 
can be significantly reduced by interframe dpcm encoding. The coding 
method which has been widely proposed for video-telephone and 
video-conference application is conditional replenishment. 1,2 In condi- 
tional replenishment, each frame of a television sequence is segmented 
into changed and unchanged areas. Various methods can be used for 
encoding the changed parts of a frame. Intraframe predictive coding is
very efficient for these parts. In conditional replenishment, no
information about the unchanged areas is transmitted. At the receiver, the
unchanged areas are reconstructed by repeating from the previous
frame. However, it is necessary to transmit address information that
indicates the location of the changed areas. Several modifications and
improvements of the basic method of conditional replenishment have
been made. Most of them are described in a survey by Haskell. 5

This paper describes adaptive intra-interframe prediction. It is ob- 
vious that stationary background of a frame is best predicted from a 
pel in the previous frame which has the same position as the pel to be 
predicted, whereas parts of a frame with moving objects are better 
predicted by an intraframe predictor. Therefore, a prediction scheme 
which provides automatic switching between the two types of predic- 
tors, depending upon the part of the picture, will result in lower bit 
rates. To avoid the transmission of additional predictor control infor- 
mation, the adaptive prediction schemes described here are based on 
previously transmitted reconstructed pels. Further, no forward seg- 
menter like that of conditional replenishment is used. Therefore, only 
the quantized prediction error has to be coded and transmitted. 

A block diagram of such a dpcm encoder with adaptive prediction is 
shown in Fig. 1. The investigations in this paper concern a comparison 
of the performance of two types of adaptive predictors. The first one, 
denoted by predictor selection, is a scheme where one predictor is 
selected from a set of predictors. In the second scheme, the predictor 
is a weighted sum of predictors and the prediction coefficients are 
changed continuously by a gradient algorithm. As a measure of pre- 
dictor performance, the entropy of the quantized prediction error is 
used. For three different television scenes an estimate of the entropy 
is obtained from dpcm simulations. The necessary measures against 
buffer overflow and underflow, in case of variable length encoding, 
have not been considered here. 

This paper is organized as follows. Section II gives a detailed 
description of the two basic algorithms and their modifications. Section 
III describes a variable length encoding scheme which is especially 
suited for dpcm coders that have improved prediction. Results of 
simulations on real scenes are given in Section IV.

Fig. 1 — Block diagram of a dpcm coder with adaptive predictor.

II. DESCRIPTION OF THE PREDICTION ALGORITHMS

Let $f_i$ be one of $M$ predictor functions. If each $f_i$ is a linear
predictor function, then

$$f_i = \sum_{j=1}^{N} a_{ij}\, s'_j, \tag{1}$$

where $a_{ij}$ are the weighting coefficients and $s'_j$ are previously
transmitted pels. The prime in $s'_j$ indicates that these are
reconstructed pels which are known at the receiver. The subscripting for
pels neighboring the present pel $s_0$ is shown in Fig. 2. The predictor
functions $f_i$, $i = 1, 2, \ldots, M$, are linear combinations of $N$
pels, $s'_j$, $j = 1, 2, \ldots, N$, which form a vector

$$\mathbf{s}' = [s'_1, s'_2, \ldots, s'_N]^T. \tag{2}$$

In vector notation, equation (1) can be written as

$$f_i = \mathbf{a}_i^T \mathbf{s}'. \tag{3}$$

Here the superscript $T$ denotes the transpose of a vector or matrix,
and $\mathbf{a}_i$ is the vector formed by the coefficients $a_{ij}$,
$j = 1, 2, \ldots, N$. The prediction value $\hat{s}_0$ is a weighted sum
of all predictor functions,

$$\hat{s}_0 = \sum_{i=1}^{M} b_i f_i. \tag{4}$$

If $\mathbf{f}$ denotes the vector of elements $f_i$, $i = 1, 2, \ldots, M$,
and $\mathbf{b}$ denotes the vector of elements $b_i$, $i = 1, 2, \ldots, M$,
then

$$\hat{s}_0 = \mathbf{b}^T \mathbf{f}. \tag{5}$$



[Diagram of the pel configuration, with panels PRESENT FRAME and PREVIOUS FRAME.]

Fig. 2 — Configuration and subscripting of picture elements. Pel $s_0$ is the present pel
to be predicted. Dotted lines denote scan lines from previous fields.

This description is general and includes switched prediction by allowing
special values of $\mathbf{b}$, such as $b_k = 1$ and $b_i = 0$ for all
$i \ne k$. Combining (3) and (5), it follows that

$$\mathbf{f} = \mathbf{A}\mathbf{s}', \tag{6}$$

$$\hat{s}_0 = \mathbf{b}^T \mathbf{A}\mathbf{s}', \tag{7}$$

where

$$\mathbf{A} = \begin{bmatrix} \mathbf{a}_1^T \\ \mathbf{a}_2^T \\ \vdots \\ \mathbf{a}_M^T \end{bmatrix} \tag{8}$$

is an $M \times N$ matrix. The set of predictor functions is described by the
matrix $\mathbf{A}$, with the coefficient vectors $\mathbf{a}_i$ chosen such
that a particular predictor function provides a good prediction for a
specific area of a television scene, like stationary background, moving
objects, etc. The algorithm then seeks to automatically adapt the vector
$\mathbf{b}$ to various areas of a scene so as to minimize the prediction error.
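As an illustration of (6) and (7), the following Python sketch evaluates a small set of predictor functions as a matrix-vector product and then forms the weighted prediction. The pel values, the four-element neighborhood, and the second row of `A` are hypothetical and chosen only for illustration; they are not the coefficients used in the simulations.

```python
def predict(A, b, s_rec):
    """Return (f, s_hat) with f = A s' as in (6) and s_hat = b^T f as in (7)."""
    f = [sum(a_ij * s_j for a_ij, s_j in zip(row, s_rec)) for row in A]
    s_hat = sum(b_i * f_i for b_i, f_i in zip(b, f))
    return f, s_hat


# Hypothetical reconstructed neighborhood s' = [s'_1, s'_2, s'_3, s'_20].
s_rec = [100.0, 96.0, 104.0, 90.0]

A = [
    [0.0, 0.0, 0.0, 1.0],  # f_1: previous frame predictor
    [0.5, 0.0, 0.5, 0.0],  # f_2: an illustrative intraframe predictor (assumed)
]

# Switched prediction is the special case b_k = 1, b_i = 0 for i != k.
f, s_switched = predict(A, [1.0, 0.0], s_rec)

# A general weighted sum of both predictor functions.
_, s_blend = predict(A, [0.25, 0.75], s_rec)
```

With these numbers the switched case simply returns the previous frame pel, whereas the blended case lies between the two predictor outputs.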

In this investigation, the set of predictor functions is restricted to a
previous frame predictor

$$f_1 = s'_{20} \tag{9}$$

and an intraframe predictor

$$f_2 = a_{21} s'_1 + a_{22} s'_2 + a_{23} s'_3. \tag{10}$$

The following prediction algorithms are described for two predictor
functions, but they can easily be extended to more than two.

2.1 Predictor selection schemes

From a set of predictor functions, the predictor which results in the
least prediction error for previously transmitted neighboring pels is
selected as the predictor for the present pel. For each predictor
function, a decision function $u_i$ is defined, which is the sum of the
magnitudes of the prediction errors for each pel in a small window of
neighboring pels. The predictor which has the smallest value of the
decision function is chosen as predictor. This criterion was also used
by Stuller et al. 6 for gain and displacement compensation. The basic
selection rule for two predictor functions is as follows:

$$\hat{s}_0 = \begin{cases} f_1 & \text{if } u_1 \le u_2 \\ f_2 & \text{if } u_1 > u_2, \end{cases} \tag{11}$$

where

$$u_i = \sum_{k \in W} \left| s'_k - f_i(s'_k) \right|. \tag{12}$$

The subscript $k$ denotes a pel in a neighborhood $W$. The window $W$ is
chosen such that $s'_k$ is known at the receiver. The decision function
$u_i$ can then be evaluated at the receiver without transmission of
additional information about predictor selection. For previously
transmitted nearest neighbors, the index set $W$ is

$$W_\alpha = \{1, 2, 3, 4\}. \tag{13}$$

For real-time implementation, the choice of $W_\alpha$ creates problems
because of the use of the pel $s'_1$. The time constraint for calculation
of $u_i$ can be eased by using the index set

$$W_\beta = \{2, 3, 4, 5\} \tag{14}$$

or

$$W_\gamma = \{2, 3, 4\} \tag{15}$$

instead of $W_\alpha$. The window $W_\gamma$ is also used by Stuller et al. 6
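The selection rule (11) with the decision function (12) can be sketched in Python as follows. This is a minimal illustration, not the simulation program: the dictionaries are keyed by the pel indices of Fig. 2, and all pel and predictor values are invented for the example.

```python
def decision_function(rec, pred, window):
    """u_i of (12): sum of |s'_k - f_i(s'_k)| over the pels k of the window."""
    return sum(abs(rec[k] - pred[k]) for k in window)


def select_predictor(rec, pred_1, pred_2, window):
    """Rule (11): return 1 if u_1 <= u_2, otherwise 2."""
    u_1 = decision_function(rec, pred_1, window)
    u_2 = decision_function(rec, pred_2, window)
    return 1 if u_1 <= u_2 else 2


W_ALPHA = (1, 2, 3, 4)  # index set (13)
W_GAMMA = (2, 3, 4)     # index set (15), avoids the immediately preceding pel

# Hypothetical reconstructed pels and the values both predictors gave there.
rec = {1: 100, 2: 98, 3: 101, 4: 103}
pred_prev = {1: 80, 2: 97, 3: 85, 4: 104}     # previous frame predictor f_1
pred_intra = {1: 99, 2: 100, 3: 100, 4: 101}  # intraframe predictor f_2

chosen = select_predictor(rec, pred_prev, pred_intra, W_ALPHA)
```

Because only reconstructed pels enter the computation, a receiver running the same function reaches the same decision without side information.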

A further simplification for hardware implementation can be obtained
by introducing a quantizer function $Q_S[\cdot]$ in (12). Then the
decision functions $u_i$ are given by

$$u_i = \sum_{k \in W} Q_S\!\left[\, \left| s'_k - f_i(s'_k) \right| \,\right]. \tag{16}$$

A modification which leads to a simpler implementation than the basic
selection rule (11) can be described as follows. Choose the predictor
function $f_i$ which, within a window $W$, most frequently has the minimum
magnitude of the difference

$$d_{ik} = s'_k - f_i(s'_k). \tag{17}$$

In the case of two predictor functions, a binary variable $v_k$, which
describes which predictor function is better at each position $k$, is
defined as follows:

$$v_k = \begin{cases} 1 & \text{if } |d_{1k}| \le |d_{2k}| \\ 0 & \text{if } |d_{1k}| > |d_{2k}|. \end{cases} \tag{18}$$

The decision functions $u_i$ are now given by

$$u_1 = \sum_{k \in W} \bar{v}_k, \qquad u_2 = \sum_{k \in W} v_k, \tag{19}$$

where $\bar{v}_k$ is the complement of $v_k$. The predictor with the smallest
value $u_i$ is chosen. The selection rules discussed above require that one
predictor function be chosen even if both decision functions $u_i$ are
identical. An improvement can be obtained by using a "soft-predictor
switch," i.e., the prediction value is a weighted sum of predictor
functions, as given by (4), with the weights $b_i$ being proportional to the
frequency of preference of the predictor function $f_i$. Hence, for two
predictor functions,

$$b_1 = \frac{1}{n} \sum_{k \in W} v_k, \qquad b_2 = \frac{1}{n} \sum_{k \in W} \bar{v}_k, \tag{20}$$

where $n$ is the number of pels in the window $W$. To avoid the division
by 3, for $W = W_\gamma$ the contribution of the pel at position 3 to (20) is
doubled, and $n$ is chosen to be 4 for this special case.
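The soft-predictor switch can be sketched in Python as below. The sketch assumes that a vote of 1 means the first predictor is at least as good at that pel (ties go to the first predictor, as in rule (11)); the pel values are hypothetical. Repeating index 3 in the window tuple is one simple way to double that pel's contribution so that n = 4 for the window {2, 3, 4} of (15).

```python
def soft_switch_weights(rec, pred_1, pred_2, window):
    """Return (b_1, b_2): fraction of window pels at which each predictor wins."""
    votes = [
        1 if abs(rec[k] - pred_1[k]) <= abs(rec[k] - pred_2[k]) else 0
        for k in window
    ]
    n = len(window)
    b_1 = sum(votes) / n
    return b_1, 1.0 - b_1  # b_2 is the share of complementary votes


# Hypothetical reconstructed pels and predictor values at pels 2, 3, 4.
rec = {2: 98, 3: 101, 4: 103}
pred_prev = {2: 97, 3: 85, 4: 104}
pred_intra = {2: 100, 3: 100, 4: 101}

# Pel 3 is listed twice, so the division is by 4 instead of 3.
b_1, b_2 = soft_switch_weights(rec, pred_prev, pred_intra, (2, 3, 3, 4))

# Weighted prediction (4) for assumed predictor outputs 90 and 102 at the present pel.
s_hat = b_1 * 90 + b_2 * 102
```

When both predictors win equally often, the result is an even blend rather than a forced choice, which is the point of the soft switch.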

2.2 Adaptive prediction based on a steepest descent method 

The steepest descent 7 is a mathematical method which has often been
used for optimization. One advantage of this method is its
simplicity. It has been used frequently for adaptive systems, and it
has also been proposed by Netravali and Robbins 8 and Stuller et al. 6 for
motion-compensated prediction. Here it will be applied to adaptive
intra-interframe prediction.

Let us assume that the prediction value is a weighted sum of
predictor functions as given by (5). Then the prediction error is given
by

$$e = s_0 - \mathbf{b}^T \mathbf{f}. \tag{21}$$

In the following, the present pel is denoted by $s$, rather than $s_0$. The
variance of the prediction error $e$ is a quadratic function in $\mathbf{b}$,

$$F(\mathbf{b}) = E[(s - \mathbf{b}^T \mathbf{f})^2], \tag{22}$$

where $E[\cdot]$ is the expected value. The gradient with respect to
$\mathbf{b}$ is given by

$$\mathbf{g} = \nabla_{\mathbf{b}} F(\mathbf{b}) = -2E[(s - \mathbf{b}^T \mathbf{f})\,\mathbf{f}] = -2E[e\mathbf{f}]. \tag{23}$$

The steepest descent is an iterative method in which, starting from an
initial guess, the vector $\mathbf{b}$ is modified recursively according to

$$\mathbf{b}^{(m+1)} = \mathbf{b}^{(m)} - \gamma^{(m)} \mathbf{g}^{(m)}. \tag{24}$$

The adjustment of the vector $\mathbf{b}^{(m)}$ is made in the direction of the
negative gradient. In principle, the scalar $\gamma^{(m)}$ has to be optimized
by a one-dimensional search scheme at each step $m$. However, real-time
applications are performed with a constant value of $\gamma$. The best value
of $\gamma$ depends on the type of data. In addition, the value of $\gamma$
influences the stability and the speed of convergence of $\mathbf{b}$.

From eqs. (23) and (24), it follows that an adaptive prediction
scheme based on a gradient method is given by

$$\mathbf{b}^{(m+1)} = \mathbf{b}^{(m)} + 2\gamma\, E_W[e\mathbf{f}]^{(m)}, \tag{25}$$

where $E_W[\cdot]$ is the expected value within a small window of neighboring
pels as given by (13), (14), or (15). The coefficient vector $\mathbf{b}$ is
updated on a pel-by-pel basis along the scanning direction, i.e., if
$\mathbf{b}^{(m+1)}$ is the coefficient vector at the present pel,
$\mathbf{b}^{(m)}$ is the coefficient vector at the previous pel. At the
beginning of each line, an initial estimate of $\mathbf{b}$ is used, e.g.,
the mean of $\mathbf{b}$ at the previous line. Simulations indicate that,
because of the fast adjustment, an initial vector $\mathbf{b}$ with elements
$b_i = 1/M$, $i = 1, 2, \ldots, M$, is appropriate.
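A single update of the recursion (25) might look as follows in Python. This is a sketch under stated assumptions: the window expectation is taken as a plain arithmetic mean over the window pels, signals are normalized to the range [0, 1], and the sample values are invented.

```python
def gradient_step(b, window_samples, gamma):
    """One step of (25): b <- b + 2 * gamma * E_W[e f].

    window_samples holds one (s, f) pair per window pel, where s is the
    reconstructed pel and f the list of predictor function values there.
    """
    grad = [0.0] * len(b)
    for s, f in window_samples:
        e = s - sum(b_i * f_i for b_i, f_i in zip(b, f))  # prediction error (21)
        for i, f_i in enumerate(f):
            grad[i] += e * f_i
    n = len(window_samples)
    return [b_i + 2.0 * gamma * g / n for b_i, g in zip(b, grad)]


def mean_square_error(b, window_samples):
    """Window estimate of F(b) in (22)."""
    return sum(
        (s - sum(b_i * f_i for b_i, f_i in zip(b, f))) ** 2
        for s, f in window_samples
    ) / len(window_samples)


# Hypothetical window: the second predictor tracks the pels much better.
samples = [(0.50, [0.20, 0.50]), (0.52, [0.25, 0.52]), (0.48, [0.30, 0.49])]
b_old = [0.5, 0.5]  # initial estimate b_i = 1/M
b_new = gradient_step(b_old, samples, gamma=1.0)
```

In this example the step moves both weights upward, the second more than the first, and the window estimate of the error variance decreases.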

In this study, several modifications of the recursion given by (25)
have been investigated. The various algorithms will be compared with
respect to prediction gain and cost of implementation. A high prediction
gain requires an appropriate value of $\gamma$ in (25). Simulations with
several values of $\gamma$ indicate that for video signals with normalized
range [0, 1] the optimum value of $\gamma$ is about one. In such a case the
adjustment from pel to pel is relatively small, and the transition from
one predictor function to another takes several pels. By introducing an
additional constraint

$$\sum_{i=1}^{M} b_i = 1, \tag{26}$$

the optimum value of $\gamma$ is increased to about 64. The increased value
of $\gamma$ provides a shorter transition from one predictor function to
another, and the constraint (26) improves the stability of the algorithm.
With the constraint of (26), the steepest descent method has to be
modified to minimize the augmented function of (22)

$$\Phi(\mathbf{b}, \lambda) = E[(s - \mathbf{b}^T \mathbf{f})^2] + \lambda(\mathbf{b}^T \mathbf{o} - 1), \tag{27}$$

where $\mathbf{o}$ is a vector with all elements equal to 1. The coefficient
vector $\mathbf{b}$ is updated recursively by

$$\mathbf{b}^{(m+1)} = \mathbf{b}^{(m)} - \gamma\left(-2E[e\mathbf{f}]^{(m)} + \lambda^{(m)} \mathbf{o}\right). \tag{28}$$

Using (26) to eliminate $\lambda^{(m)}$ from (28), and replacing $E[\cdot]$ by
$E_W[\cdot]$, we obtain

$$\mathbf{b}^{(m+1)} = \mathbf{b}^{(m)} + 2\gamma\, \mathbf{C}\, E_W[e\mathbf{f}]^{(m)}, \tag{29}$$

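where $\mathbf{C}$ is an $M \times M$ matrix given by

$$\mathbf{C} = \mathbf{U} - \frac{1}{M}\mathbf{o}\mathbf{o}^T =
\begin{bmatrix}
1 - \frac{1}{M} & -\frac{1}{M} & \cdots & -\frac{1}{M} \\
-\frac{1}{M} & 1 - \frac{1}{M} & \cdots & -\frac{1}{M} \\
\vdots & \vdots & \ddots & \vdots \\
-\frac{1}{M} & -\frac{1}{M} & \cdots & 1 - \frac{1}{M}
\end{bmatrix} \tag{30}$$

and $\mathbf{U}$ is the unit matrix. Because of (26), it follows that

$$s - \mathbf{b}^T \mathbf{f} = \mathbf{b}^T(s\,\mathbf{o} - \mathbf{f}) = \mathbf{b}^T \mathbf{d}, \tag{31}$$

where $\mathbf{d}$ is a vector of differences similar to (17). This leads to
a recursion equivalent to (29), given by

$$\mathbf{b}^{(m+1)} = \mathbf{b}^{(m)} - 2\gamma\, \mathbf{C}\, E_W[e\mathbf{d}]^{(m)}. \tag{32}$$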
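The elimination of the multiplier in going from (28) to (29) can be written out explicitly. The steps below are a worked reconstruction that uses only (26) and (28), with $\mathbf{o}^T\mathbf{o} = M$ and with the constraint required to hold both before and after the update:

```latex
\begin{align*}
\mathbf{o}^T \mathbf{b}^{(m+1)}
  &= \mathbf{o}^T \mathbf{b}^{(m)}
     + 2\gamma\, \mathbf{o}^T E[e\mathbf{f}]^{(m)}
     - \gamma \lambda^{(m)}\, \mathbf{o}^T \mathbf{o} \\
1 &= 1 + 2\gamma\, \mathbf{o}^T E[e\mathbf{f}]^{(m)} - \gamma \lambda^{(m)} M \\
\lambda^{(m)} &= \frac{2}{M}\, \mathbf{o}^T E[e\mathbf{f}]^{(m)} \\
\mathbf{b}^{(m+1)}
  &= \mathbf{b}^{(m)}
     + 2\gamma \left( \mathbf{U} - \frac{1}{M}\, \mathbf{o}\mathbf{o}^T \right)
       E[e\mathbf{f}]^{(m)}
\end{align*}
```

The equivalence of the two recursions (29) and (32) follows in the same way: since $\mathbf{f} = s\,\mathbf{o} - \mathbf{d}$ and $\mathbf{C}\mathbf{o} = \mathbf{0}$, one has $\mathbf{C}E[e\mathbf{f}] = -\mathbf{C}E[e\mathbf{d}]$.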



In the recursions given above, the coefficient vector $\mathbf{b}$ at the
previous pel is updated by an adjustment to obtain the coefficient vector at
the present pel. However, since a picture is two-dimensional in nature, the
values of $\mathbf{b}$ for pels from the previous line in the immediate
neighborhood of the present pel are also quite close to that of the present
pel. This idea results in a modification of (25), which is given below:

$$\mathbf{b}^{(m+1)} = E_W[\mathbf{b}]^{(m)} + 2\gamma\, E_W[e\mathbf{f}]^{(m)}. \tag{33}$$

Let us assume that the samples $s$ and the predictor functions $f_i$ are
represented by 8 bits. In such cases, in the recursions given above, a
product of two 8-bit numbers has to be calculated at each position within
the window. A reduction in the cost of implementation can be
achieved by using the three-level quantizer, shown in Fig. 3, for the
prediction error $e$ and the differences $\mathbf{d}$. These investigations
show that a three-level quantizer with a dead zone is more advantageous
than the signum function used by Netravali and Robbins. 8

The algorithms (29) and (33) for the case of two predictor functions,
in combination with a three-level quantizer $Q_D$, result in the following
recursive scheme,

$$\begin{aligned}
b_1^{(m+1)} &= E_W[b_1]^{(m)} + \gamma\, E_W[Q_D(e)\, Q_D(f_1 - f_2)]^{(m)} \\
b_2^{(m+1)} &= E_W[b_2]^{(m)} - \gamma\, E_W[Q_D(e)\, Q_D(f_1 - f_2)]^{(m)},
\end{aligned} \tag{34}$$

with the constraints

$$b_1 + b_2 = 1, \qquad 0 \le b_1, \qquad 0 \le b_2. \tag{35}$$

Fig. 3 — Three-level quantizer for gradient quantization.

The latter two constraints of (35) were introduced to avoid negative
weighting coefficients.
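The recursion (34) with the constraints (35) can be sketched in Python as follows. Several details are assumptions made for the illustration: the window expectation is an arithmetic mean, inputs whose magnitude does not exceed the threshold fall into the dead zone, and the constraints are enforced by clipping the first weight to [0, 1] and setting the second to its complement. The threshold of 4 and the step size of 1/4 are the values reported in Section IV; the window data are invented.

```python
def q_dead_zone(x, threshold):
    """Three-level quantizer with a dead zone: returns -1, 0, or +1."""
    if x > threshold:
        return 1
    if x < -threshold:
        return -1
    return 0


def update_weights(b1_mean, window, gamma, threshold):
    """Sketch of (34) and (35) for two predictor functions.

    b1_mean : E_W[b_1], the mean of b_1 over the window pels
    window  : one (e, f_1, f_2) triple per window pel
    """
    corr = sum(
        q_dead_zone(e, threshold) * q_dead_zone(f_1 - f_2, threshold)
        for e, f_1, f_2 in window
    ) / len(window)
    b_1 = b1_mean + gamma * corr
    b_1 = min(1.0, max(0.0, b_1))  # keep both weights nonnegative
    return b_1, 1.0 - b_1          # b_1 + b_2 = 1


# Hypothetical window pels: (prediction error, f_1, f_2) on an 8-bit scale.
window = [(10, 120, 100), (6, 118, 101), (-2, 110, 108), (8, 90, 97)]
b_1, b_2 = update_weights(0.5, window, gamma=0.25, threshold=4)
```

Only sign-like quantities are multiplied, so the 8-bit by 8-bit products of the unquantized recursion reduce to additions of -1, 0, or +1.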

III. VARIABLE LENGTH ENCODING BY HORIZONTAL RUN LENGTH 

An adaptive prediction scheme leads to many predictable pels. A 
pel is described as predictable if its quantized prediction error is 
represented by the level zero. To obtain a low transmission rate, the 
quantized prediction error is coded by a variable length code. There is
always a loss in mean transmission rate compared to the entropy unless
the negative base-2 logarithms of the probabilities of all representative
levels of the prediction error are integers. This loss is especially high if one level
has a probability much larger than 0.5. For adaptive prediction 
schemes, this is true for the quantizer level zero. To overcome this 
problem, block coding is frequently used. For the application described, 
a special coding scheme is proposed. 

From each frame, a two-level picture is generated which indicates 
where the pels with zero code words (zcw) and where the pels with 
nonzero code words (nzcw) are located. This new picture can be coded 
by known one-dimensional and two-dimensional coding techniques for 
two-level pictures. The nzcws are coded in parallel by a variable- 
length code like a Huffman code and multiplexed with the code words 
of the two-level picture such that the receiver can decide between the 
two types of data. A block diagram of such a coder is shown in Fig. 4. 

For a horizontal run length code, the set of symbols to be coded is
listed in Fig. 5. For each of the sets, i.e., zero runs (zr), nonzero runs
(nzr), and nonzero code words (nzcw), a variable length code can be
determined independently and matched to the probability of the
symbols of that particular set (e.g., Huffman code). The types of runs
are chosen so as to allow a wrap-around coding from line to line.
Wrap-around coding means that a run is not terminated at the end of a line
but continued in the next line. Furthermore, the longest run to be
coded could be shorter than one line. The code words must be
transmitted in a sequence so that the receiver always knows which code
table must be used for decoding. Fig. 6 gives an example in which a zr
is transmitted at the beginning of a line. In this example, it is also
assumed that the nzcws are transmitted just after the corresponding
run.

Fig. 4 — Block diagram of a new type of variable length encoding.

    REGULAR SET OF CODE WORDS
        {0, 1, 2, ...}

    NEW SETS FOR CODING

    (i) Set of nonzero code words (NZCW)
        {1, 2, ...}

    (ii) Set of zero runs (ZR)

          l       ZR
          0       1
          1       01
          2       001
          3       0001
          ...
          m       0000 ... 01
          m + 1   0000 ... 00

    (iii) Set of nonzero runs (NZR)

          l       NZR
          0       0
          1       10
          2       110
          3       1110
          ...
          m       1111 ... 10
          m + 1   1111 ... 11

Fig. 5 — Set of symbols for horizontal run length coding.
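The symbol sequence of such a horizontal run length code can be sketched in Python. The function below is an illustrative reading of the scheme, not the coder used in the simulations: a zero run counts the zeros preceding a nonzero code word, a nonzero run counts the further nonzero code words up to the terminating zero, and the nonzero code words follow their run. It ignores the maximum run length and simply emits the unfinished run at the end of the input instead of carrying it into the next line; the input is the line of code words used in the example of Fig. 6.

```python
def run_length_symbols(code_words):
    """Convert a line of quantizer code words into ZR/NZR/CW symbols."""
    symbols = []
    pending = []        # nonzero code words of the current nonzero run
    in_zero_run = True  # a line is assumed to start with a zero run
    count = 0
    for cw in code_words:
        if in_zero_run:
            if cw == 0:
                count += 1
            else:  # a nonzero code word terminates the zero run
                symbols.append(f"ZR{count}")
                pending, in_zero_run, count = [cw], False, 0
        else:
            if cw != 0:
                count += 1
                pending.append(cw)
            else:  # a zero terminates the nonzero run
                symbols.append(f"NZR{count}")
                symbols.extend(f"CW{v}" for v in pending)
                pending, in_zero_run, count = [], True, 0
    # Emit whatever run is still open (wrap-around would continue it instead).
    if in_zero_run:
        if count:
            symbols.append(f"ZR{count}")
    else:
        symbols.append(f"NZR{count}")
        symbols.extend(f"CW{v}" for v in pending)
    return symbols


line = [int(c) for c in "00045603200000400000"]
print(", ".join(run_length_symbols(line)))
```

Because zero runs and nonzero runs strictly alternate, a decoder always knows which of the three code tables applies to the next symbol.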

The entropy

$$H = -\sum_i p_i \log_2 p_i \tag{36}$$

is used as an estimate for the mean code word length, with $p_i$ being the
relative frequency of the $i$th code word derived from the dpcm
simulation of a tv sequence. The variable length code described above
consists of three independent codes. Hence, the entropy $H_{RUN}$ in bits
per sample is given by

$$H_{RUN} = \frac{n_{NZCW}}{n_{PEL}} H_{NZCW} + \frac{n_{ZR}}{n_{PEL}} H_{ZR} + \frac{n_{NZR}}{n_{PEL}} H_{NZR}, \tag{37}$$

where $n$ is the number of events specified by the subscript.

An advantage of the type of run length coding presented here is that
in the case of statistically independent symbols, the overall entropy is
not changed ($H_{PEL} = H_{RUN}$). In the case of interframe coding, the zeros
and nonzeros are grouped together because they are related to the
picture content. In this case, a decrease in entropy is achieved by the
horizontal run length coding.
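The entropy estimates (36) and (37) are straightforward to compute from symbol counts. The following Python sketch assumes base-2 logarithms (bits) and takes the three symbol streams as plain lists; the function names are illustrative only.

```python
from collections import Counter
from math import log2


def entropy(symbols):
    """Entropy (36) in bits per symbol, from relative frequencies."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())


def run_entropy(nzcw, zr, nzr, n_pel):
    """Entropy (37) in bits per pel of the three independent codes."""
    return (
        len(nzcw) * entropy(nzcw)
        + len(zr) * entropy(zr)
        + len(nzr) * entropy(nzr)
    ) / n_pel


# Symbol streams for a single 20-pel line: nonzero code words 4, 5, 6, 3, 2, 4,
# zero runs 3, 0, 4, 4 and nonzero runs 2, 1, 0.
h_run = run_entropy([4, 5, 6, 3, 2, 4], [3, 0, 4, 4], [2, 1, 0], n_pel=20)
```

A single line is far too short for meaningful statistics; in practice the counts would be accumulated over whole frames or sequences before the entropies are evaluated.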

    LINE OF INPUT CODE WORDS
        00045603200000400000

    BINARY ZERO-NONZERO PATTERN
        00011101100000100000 (1)

    POSSIBLE CODE STRING
        ZR3, NZR2, CW4, CW5, CW6, ZR0, NZR1, CW3, CW2, ZR4, NZR0, CW4, ZR4

Fig. 6 — Example of a horizontal run length code.

IV. SIMULATION RESULTS

Computer simulations were performed for the prediction algorithms
given above using three different television sequences. These
sequences are the same as those used in Refs. 6 and 8. Each sequence
consists of 60 frames obtained by sampling a video signal of 1-MHz
bandwidth at the Nyquist rate. Each sample was quantized to 8 bits.
One frame of each sequence is shown in Fig. 7.

One scene, called Judy, is a head-and-shoulders view of a person 
engaged in active conversation. The second scene, John and Mike, 
shows two people entering the camera field of view and walking briskly 
around each other. The third sequence, Mike and Nadine, is a panned 
view of two people always in view of the camera. 

Even though the quantizer characteristic of a dpcm coder should be 
designed according to the prediction scheme, for simplification in these 
investigations, the same 35-level quantizer shown below was used for 
all simulations. The quantizer has the following positive representative 
levels: 0, 5, 12, 19, 28, 37, 46, 57, 68, 79, 90, 103, 116, 129, 142, 155, 168, 
181. This quantizer was chosen since it gave good picture quality, 
although the quantization error was visible in specific picture areas 
at short viewing distance. The decision levels are always in the
middle between two succeeding levels. The performance of the prediction
schemes was evaluated by computing the pel entropy, the entropy
of a horizontal run length code, and the variance of the quantized
prediction error.

Fig. 7a One frame out of each sequence — Scene Judy.

Fig. 7b One frame out of each sequence — Scene John and Mike.

Fig. 7c One frame out of each sequence — Scene Mike and Nadine.
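The 35-level quantizer described above can be sketched in Python from its positive representative levels, with decision levels halfway between neighboring levels and a symmetric negative half. The treatment of an input that falls exactly on a decision level (here it is assigned to the larger magnitude) is an assumption, since the text does not specify it.

```python
from bisect import bisect_right

# Representative levels for nonnegative prediction errors, as listed in the text.
LEVELS = [0, 5, 12, 19, 28, 37, 46, 57, 68, 79, 90, 103, 116, 129, 142, 155, 168, 181]

# Decision levels midway between succeeding representative levels.
THRESHOLDS = [(lo + hi) / 2 for lo, hi in zip(LEVELS, LEVELS[1:])]


def quantize(e):
    """Map a prediction error to its representative level (sign-symmetric)."""
    magnitude = LEVELS[bisect_right(THRESHOLDS, abs(e))]
    return magnitude if e >= 0 else -magnitude


# Zero plus 17 positive and 17 negative levels gives 35 levels in total.
all_levels = {quantize(e) for e in range(-255, 256)}
```

A pel counts as predictable in the sense of Section III when `quantize` returns zero, which with these levels happens for prediction errors of magnitude below 2.5.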

For comparison of adaptive and nonadaptive schemes, results for 
four nonadaptive predictors were obtained. The nonadaptive predic- 
tion schemes which were used are given below. 

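$$\hat{s} = s'_{20} \tag{38}$$

$$\hat{s} = s'_1 - s'_{21} + s'_{20} \tag{39}$$

$$\hat{s} = \tfrac{3}{4} s'_1 - \tfrac{2}{4} s'_2 + \tfrac{3}{4} s'_3 + \tfrac{3}{4} s'_{20} - \tfrac{2}{4} s'_{21} + \tfrac{1}{4} s'_{22} - \tfrac{2}{4} s'_{23} \tag{40}$$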
l-Irf-frt + f* (4D 

The first predictor (38) is simple previous frame prediction. The 
prediction scheme given by (39) is frequently proposed for interframe 
coding. 5,9 The predictor (40) is a three-dimensional predictor proposed 
by Klie 10 for moving areas of a picture. Equation (41) describes an 
intraframe predictor which minimizes the variance of the prediction 
error. 11 

The results of the nonadaptive predictors are shown in the upper
part of Tables Ia, b, and c. These investigations show that previous
frame prediction (38) is advantageous for sequences with little
motion (Judy), while the intraframe predictor (41) and the predictor
(40) are better for sequences with rapidly moving objects (Mike and
Nadine). An additional decrease in entropy can be obtained by using
the horizontal run length coding scheme. This gain is especially high
(16 percent) for the sequence Judy, where zr and nzr are better
grouped.



Table Ia — Entropy per pel and variance of the prediction error for
nonadaptive and adaptive predictors — Scene Judy.

    Entropy in bits per pel    Variance
    H_PEL      H_RUN           E[e^2]     Prediction scheme
    -----      -----           ------     -----------------
    1.035      0.875            16.6      Previous frame, eq. (38)
    1.120      0.953             8.5      2-D Interframe, eq. (39)
    1.349      1.297             9.1      3-D Interframe, eq. (40)
    1.840      1.760            31.7      2-D Intraframe, eq. (41)
    0.838      0.765             5.3      Predictor selection, eq. (11), (12), $W_\alpha$
    0.781      0.718             4.8      Predictor selection with soft switch, eq. (18), (20), $W_\alpha$
    0.783      0.730             4.9      Gradient algorithm, eq. (34), $W_\alpha$

Table Ib — Entropy per pel and variance of the prediction error for
nonadaptive and adaptive prediction — Scene John and Mike.

    Entropy in bits per pel    Variance
    H_PEL      H_RUN           E[e^2]     Prediction scheme
    -----      -----           ------     -----------------
    2.393      2.190           142.1      Previous frame, eq. (38)
    2.400      2.286           114.1      2-D Interframe, eq. (39)
    2.154      2.094            61.5      3-D Interframe, eq. (40)
    2.397      2.323            88.9      2-D Intraframe, eq. (41)
    1.795      1.711            39.6      Predictor selection, eq. (11), (12), $W_\alpha$
    1.774      1.687            36.7      Predictor selection with soft switch, eq. (18), (20), $W_\alpha$
    1.724      1.629            34.2      Gradient algorithm, eq. (34), $W_\alpha$



Table Ic — Entropy per pel and variance of the prediction error for
nonadaptive and adaptive predictors — Scene Mike and Nadine.

    Entropy in bits per pel    Variance
    H_PEL      H_RUN           E[e^2]     Prediction scheme
    -----      -----           ------     -----------------
    2.859      2.809           194.9      Previous frame, eq. (38)
    3.008      2.982           250.0      2-D Interframe, eq. (39)
    2.537      2.504           108.0      3-D Interframe, eq. (40)
    2.546      2.506           117.1      2-D Intraframe, eq. (41)
    2.385      2.353            87.4      Predictor selection, eq. (11), (12), $W_\alpha$
    2.370      2.336            80.8      Predictor selection with soft switch, eq. (18), (20), $W_\alpha$
    2.325      2.284            77.2      Gradient algorithm, eq. (34), $W_\alpha$



Adaptive prediction schemes as given in Section II were simulated
with (38) and (41) as predictor functions. The average bit rate per pel
for three schemes is shown in the lower part of Tables Ia, b, and c.
The adaptive schemes give an additional decrease in entropy if the
horizontal run length coding technique is used; this improvement
depends upon the type of picture.

Compared to the case of simple previous frame prediction, the 
predictor selection in combination with horizontal run length coding 
results in reductions of 18 to 29 percent. The corresponding reductions 
for the more sophisticated gradient method are 20 to 32 percent. The 
minimum and maximum entropy of a single frame within a sequence 
are reduced by about the same amount as the average entropy of the 
sequence. This can be recognized for the gradient method in Fig. 8, 
which shows the entropy per pel of each frame versus frame number. 
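The quoted reductions can be checked against the table entries with a few lines of arithmetic. The Python sketch below uses the pel entropy of previous frame prediction as the reference and the run length entropies of three schemes, all taken from Tables Ia, b, and c; the dictionary layout and the function name are illustrative.

```python
# Entropies in bits per pel, read from Tables Ia, b, and c.
H_PEL_PREVIOUS = {"Judy": 1.035, "John and Mike": 2.393, "Mike and Nadine": 2.859}
H_RUN_PREVIOUS = {"Judy": 0.875, "John and Mike": 2.190, "Mike and Nadine": 2.809}
H_RUN_SELECTION = {"Judy": 0.765, "John and Mike": 1.711, "Mike and Nadine": 2.353}
H_RUN_GRADIENT = {"Judy": 0.730, "John and Mike": 1.629, "Mike and Nadine": 2.284}


def gain_percent(reference, value):
    """Relative entropy reduction in percent with respect to the reference."""
    return 100.0 * (reference - value) / reference


for scene, h_ref in H_PEL_PREVIOUS.items():
    print(
        scene,
        round(gain_percent(h_ref, H_RUN_PREVIOUS[scene]), 1),
        round(gain_percent(h_ref, H_RUN_SELECTION[scene]), 1),
        round(gain_percent(h_ref, H_RUN_GRADIENT[scene]), 1),
    )
```

The smallest and largest values in each column fall close to the ranges stated in the text: roughly 2 to 16 percent for run length coding alone, 18 to 29 percent for predictor selection, and 20 to 32 percent for the gradient method.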

In Section II, several modifications of the basic methods, to obtain
a simpler hardware implementation, were presented. Most of these
modifications have only a small influence on the entropy. The basic
predictor selection scheme requires the summation of 8-bit numbers
for determination of the decision functions (12). A coarse four-level
quantizer

$$Q_S[x] = \begin{cases}
0 & 0 \le |x| < 6 \\
1 & 6 \le |x| < 18 \\
2 & 18 \le |x| < 36 \\
4 & 36 \le |x|
\end{cases} \tag{42}$$

for determination of the decision function (16) increases the entropy
by about 1 percent.

Fig. 8 — Plots of entropy per pel versus frame number for each sequence. Configuration
one shows the pel entropy $H_{PEL}$ of previous frame prediction; two shows the horizontal
run length entropy $H_{RUN}$ of previous frame prediction; and three shows the horizontal
run length entropy $H_{RUN}$ of the gradient algorithm (33) with the constraint (26). (a)
Scene Judy, (b) Scene John and Mike, (c) Scene Mike and Nadine.

The use of a binary variable $v_k$, equation (18), which indicates which
predictor function is advantageous at the position $k$, in combination
with the soft-switch algorithm of equation (20) is to be preferred.
Compared to the predictor selection scheme (11), (12), this algorithm
provides a reduction of up to 7 percent in entropy. In addition, it is
easier to implement.

For the gradient method, the algorithm (34), which incorporates
several modifications of the original method, is useful with respect to
both cost of implementation and reduction in entropy. The constraint (26)
is especially advantageous. For the algorithm (34), a three-level
quantizer with thresholds at ±4 was used. The optimum value of $\gamma$ was
found to be 1/4. Each line started with initial values $b_1 = 1/2$ and
$b_2 = 1/2$ for $\mathbf{b}$. As long as the weighting coefficients $b_i$ are
represented by more than 4 bits, the gradient method provides a small gain in
entropy compared to the predictor selection schemes.

In these investigations, three windows $W_\alpha$, $W_\beta$, and $W_\gamma$
were used. The window $W_\beta$ provides results very close to those of
$W_\alpha$, whereas $W_\gamma$ provides an increase of about 2 percent in entropy.

Further, it was found that using three predictor functions (the 
intraframe predictor is now split into two functions, one for horizontal 
prediction and one for vertical prediction) is not better. Besides the 
intraframe predictor function (39), the predictor function 

f2 = -s\+- s' 3 (43) 

was also used. This resulted in an increase of 4 to 5 percent in the 
entropy. 

It is of interest to know how these adaptive schemes perform in 
comparison with conditional replenishment and displacement compen- 
sation schemes. The results published in Ref. 6 (Table I, page 1235) 
based on the same source data are of some interest in this context. 
Hence, a comparison is possible, but it should be noted that the 35- 
level quantizer used in this investigation is a modification of the one 
used in Refs. 6 and 8. Further in this investigation, an additional 
thresholding of prediction error is not performed. 

Compared to conditional replenishment, the adaptive schemes provide
a reduction in entropy of 19 to 38 percent, depending upon the
scene. For active scenes like John and Mike and Mike and Nadine, the
adaptive schemes provide a data rate close to that of displacement
compensation (within ±5 percent range). The run length coding
scheme provides an additional reduction in entropy for sequences with
low activity. For the sequence Judy, this reduction is 26 percent
compared to conditional replenishment in case of previous frame
prediction in combination with run length coding.

V. CONCLUSION 

The performance of two types of adaptive intra-interframe predic- 
tors in combination with horizontal run length coding was studied. 
The gain in entropy of the predictor selection scheme is nearly as high 
as that of an adaptive scheme which is based on a gradient technique. 
Various modifications of the two basic methods which were investi- 
gated provided only small changes in entropy. Therefore, the adaptive 
algorithm which has the lowest cost of implementation should be 
chosen. 

Further investigations are necessary for the quantizer design and 
the buffer control in a fixed rate system. A combination of the described 
adaptive intra-interframe algorithms with motion compensation will 
result in a more sophisticated system which provides further entropy 
reduction.